AITopics | input language

Collaborating Authors

input language

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

When Language Shapes Thought: Cross-Lingual Transfer of Factual Knowledge in Question Answering

Kang, Eojin, Kim, Juae

arXiv.org Artificial IntelligenceNov-11-2025

Multilingual large language models (LLMs) offer promising opportunities for cross-lingual information access, yet their use of factual knowledge remains highly sensitive to the input language. Prior work has addressed this through English prompting and evaluation, assuming that English-based reasoning is universally beneficial. In this work, we challenge that assumption by exploring factual knowledge transfer from non-English to English through the lens of Language and Thought Theory. We introduce Language-to-Thought (L2T) prompting, which aligns the model's internal ''thinking'' language with the source of knowledge. Across three languages and four models, L2T consistently outperforms English-based reasoning, reversing the expected advantage of English prompts. Our code is available at https://github.com/GeomeunByeol/Language2Thought.

computational linguistic, large language model, machine learning, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3746252.3760807

2505.24409

Country:

North America > United States (1.00)
Europe (0.93)
Asia > Middle East > UAE (0.28)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Separating Tongue from Thought: Activation Patching Reveals Language-Agnostic Concept Representations in Transformers

Dumas, Clément, Wendler, Chris, Veselovsky, Veniamin, Monea, Giovanni, West, Robert

arXiv.org Artificial IntelligenceJan-9-2025

A central question in multilingual language modeling is whether large language models (LLMs) develop a universal concept representation, disentangled from specific languages. In this paper, we address this question by analyzing latent representations (latents) during a word translation task in transformer-based LLMs. We strategically extract latents from a source translation prompt and insert them into the forward pass on a target translation prompt. By doing so, we find that the output language is encoded in the latent at an earlier layer than the concept to be translated. Building on this insight, we conduct two key experiments. First, we demonstrate that we can change the concept without changing the language and vice versa through activation patching alone. Second, we show that patching with the mean over latents across different languages does not impair and instead improves the models' performance in translating the concept. Our results provide evidence for the existence of language-agnostic concept representations within the investigated models.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2411.08745

Country:

Europe > Switzerland > Vaud > Lausanne (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Soft Language Identification for Language-Agnostic Many-to-One End-to-End Speech Translation

Wang, Peidong, Xue, Jian, Li, Jinyu, Chen, Junkun, Subramanian, Aswin Shanmugam

arXiv.org Artificial IntelligenceJun-11-2024

Language-agnostic many-to-one end-to-end speech translation models can convert audio signals from different source languages into text in a target language. These models do not need source language identification, which improves user experience. In some cases, the input language can be given or estimated. Our goal is to use this additional language information while preserving the quality of the other languages. We accomplish this by introducing a simple and effective linear input network. The linear input network is initialized as an identity matrix, which ensures that the model can perform as well as, or better than, the original model. Experimental results show that the proposed method can successfully enhance the specified language, while keeping the language-agnostic ability of the many-to-one ST models.

lin, speech recognition, st model, (15 more...)

arXiv.org Artificial Intelligence

2406.10276

Country: North America > United States (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Do Llamas Work in English? On the Latent Language of Multilingual Transformers

Wendler, Chris, Veselovsky, Veniamin, Monea, Giovanni, West, Robert

arXiv.org Artificial IntelligenceJun-8-2024

We ask whether multilingual language models trained on unbalanced, English-dominated corpora use English as an internal pivot language -- a question of key importance for understanding how language models function and the origins of linguistic bias. Focusing on the Llama-2 family of transformer models, our study uses carefully constructed non-English prompts with a unique correct single-token continuation. From layer to layer, transformers gradually map an input embedding of the final prompt token to an output embedding from which next-token probabilities are computed. Tracking intermediate embeddings through their high-dimensional space reveals three distinct phases, whereby intermediate embeddings (1) start far away from output token embeddings; (2) already allow for decoding a semantically correct next token in the middle layers, but give higher probability to its version in English than in the input language; (3) finally move into an input-language-specific region of the embedding space. We cast these results into a conceptual model where the three phases operate in "input space", "concept space", and "output space", respectively. Crucially, our evidence suggests that the abstract "concept space" lies closer to English than to other languages, which may have important consequences regarding the biases held by multilingual language models.

layer layer layer, probability, translation, (13 more...)

arXiv.org Artificial Intelligence

2402.10588

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > Finland (0.04)
(2 more...)

Genre: Research Report (0.81)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.91)

Add feedback

Automated Verification of Equivalence Properties in Advanced Logic Programs -- Bachelor Thesis

Heuer, Jan

arXiv.org Artificial IntelligenceNov-25-2023

With the increase in industrial applications using Answer Set Programming, the need for formal verification tools, particularly for critical applications, has also increased. During the program optimisation process, it would be desirable to have a tool which can automatically verify whether an optimised subprogram can replace the original subprogram. Formally this corresponds to the problem of verifying the strong equivalence of two programs. In order to do so, the translation tool anthem was developed. It can be used in conjunction with an automated theorem prover for classical logic to verify that two programs are strongly equivalent. With the current version of anthem, only the strong equivalence of positive programs with a restricted input language can be verified. This is a result of the translation $\tau^*$ implemented in anthem that produces formulas in the logic of here-and-there, which coincides with classical logic only for positive programs. This thesis extends anthem in order to overcome these limitations. First, the transformation $\sigma^*$ is presented, which transforms formulas from the logic of here-and-there to classical logic. A theorem formalises how $\sigma^*$ can be used to express equivalence in the logic of here-and-there in classical logic. Second, the translation $\tau^*$ is extended to programs containing pools. Another theorem shows how $\sigma^*$ can be combined with $\tau^*$ to express the strong equivalence of two programs in classical logic. With $\sigma^*$ and the extended $\tau^*$, it is possible to express the strong equivalence of logic programs containing negation, simple choices, and pools. Both the extended $\tau^*$ and $\sigma^*$ are implemented in a new version of anthem. Several examples of logic programs containing pools, negation, and simple choice rules, which the new version of anthem can translate to classical logic, are presented. Some a...

classical logic, logic, strong equivalence, (11 more...)

arXiv.org Artificial Intelligence

2310.19806

Country: Europe > Germany > Brandenburg > Potsdam (0.04)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)

Add feedback

What Makes a Language Easy to Deep-Learn?

Galke, Lukas, Ram, Yoav, Raviv, Limor

arXiv.org Artificial IntelligenceSep-22-2023

Neural networks drive the success of natural language processing. A fundamental property of language is its compositional structure, allowing humans to produce forms for new meanings systematically. However, unlike humans, neural networks notoriously struggle with systematic generalization, and do not necessarily benefit from compositional structure in emergent communication simulations. This poses a problem for using neural networks to simulate human language learning and evolution, and suggests crucial differences in the biases of the different learning systems. Here, we directly test how neural networks compare to humans in learning and generalizing different input languages that vary in their degree of structure. We evaluate the memorization and generalization capabilities of a pre-trained language model GPT-3.5 (analagous to an adult second language learner) and recurrent neural networks trained from scratch (analaogous to a child first language learner). Our results show striking similarities between deep neural networks and adult human learners, with more structured linguistic input leading to more systematic generalization and to better convergence between neural networks and humans. These findings suggest that all the learning systems are sensitive to the structure of languages in similar ways with compositionality being advantageous for learning. Our findings draw a clear prediction regarding children's learning biases, as well as highlight the challenges of automated processing of languages spoken by small communities. Notably, the similarity between humans and machines opens new avenues for research on language learning and evolution.

generalization, input language, similarity, (15 more...)

arXiv.org Artificial Intelligence

2302.12239

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > Germany > Saxony > Leipzig (0.04)
(8 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Education > Curriculum > Subject-Specific Education (0.55)
Education > Educational Setting (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Code-Switching without Switching: Language Agnostic End-to-End Speech Translation

Huber, Christian, Ugan, Enes Yavuz, Waibel, Alexander

arXiv.org Artificial IntelligenceNov-9-2022

We propose a) a Language Agnostic end-to-end Speech Translation model (LAST), and b) a data augmentation strategy to increase code-switching (CS) performance. With increasing globalization, multiple languages are increasingly used interchangeably during fluent speech. Such CS complicates traditional speech recognition and translation, as we must recognize which language was spoken first and then apply a language-dependent recognizer and subsequent translation component to generate the desired target language output. Such a pipeline introduces latency and errors. In this paper, we eliminate the need for that, by treating speech recognition and translation as one unified end-to-end speech translation problem. By training LAST with both input languages, we decode speech into one target language, regardless of the input language. LAST delivers comparable recognition and speech translation accuracy in monolingual usage, while reducing latency and error rate considerably when CS is observed.

aber bevor ich ihnen zeige, artificial intelligence, natural language, (9 more...)

arXiv.org Artificial Intelligence

2210.01512

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
Europe > Germany > Baden-Württemberg > Karlsruhe Region > Karlsruhe (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Add feedback

The Effect of Efficient Messaging and Input Variability on Neural-Agent Iterated Language Learning

Lian, Yuchen, Bisazza, Arianna, Verhoef, Tessa

arXiv.org Artificial IntelligenceApr-15-2021

Natural languages commonly display a trade-off among different strategies to convey constituent roles. A similar trade-off, however, has not been observed in recent simulations of iterated language learning with neural network based agents (Chaabouni et al., 2019b). In this work, we re-evaluate this result in the light of two important factors, namely: the lack of effort-based pressure in the agents and the lack of variability in the initial input language.

agent, input language, utterance, (13 more...)

arXiv.org Artificial Intelligence

2104.07637

Country:

Europe > Sweden > Uppsala County > Uppsala (0.05)
Europe > Netherlands > South Holland > Leiden (0.05)
Europe > Italy > Tuscany > Florence (0.04)
(2 more...)

Genre: Research Report (0.82)

Industry: Education > Curriculum > Subject-Specific Education (0.61)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback

Testing the Quantitative Spacetime Hypothesis using Artificial Narrative Comprehension (II) : Establishing the Geometry of Invariant Concepts, Themes, and Namespaces

Burgess, Mark

arXiv.org Artificial IntelligenceSep-23-2020

Given a pool of observations selected from a sensor stream, input data can be robustly represented, via a multiscale process, in terms of invariant concepts, and themes. Applying this to episodic natural language data, one may obtain a graph geometry associated with the decomposition, which is a direct encoding of spacetime relationships for the events. This study contributes to an ongoing application of the Semantic Spacetime Hypothesis, and demonstrates the unsupervised analysis of narrative texts using inexpensive computational methods without knowledge of linguistics. Data streams are parsed and fractionated into small constituents, by multiscale interferometry, in the manner of bioinformatic analysis. Fragments may then be recombined to construct original sensory episodes---or form new narratives by a chemistry of association and pattern reconstruction, based only on the four fundamental spacetime relationships. There is a straightforward correspondence between bioinformatic processes and this cognitive representation of natural language. Features identifiable as `concepts' and `narrative themes' span three main scales (micro, meso, and macro). Fragments of the input act as symbols in a hierarchy of alphabets that define new effective languages at each scale.

artificial intelligence, natural language, text processing, (20 more...)

arXiv.org Artificial Intelligence

2010.08125

Country:

North America > United States > New York (0.04)
South America > Colombia (0.04)
North America > United States > Illinois (0.04)
(4 more...)

Genre: Research Report (0.64)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.92)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.67)

Add feedback

eclingo: A solver for Epistemic Logic Programs

Cabalar, Pedro, Fandinno, Jorge, Garea, Javier, Romero, Javier, Schaub, Torsten

arXiv.org Artificial IntelligenceAug-5-2020

We describe eclingo, a solver for epistemic logic programs under Gelfond 1991 semantics built upon the Answer Set Programming system clingo. The input language of eclingo uses the syntax extension capabilities of clingo to define subjective literals that, as usual in epistemic logic programs, allow for checking the truth of a regular literal in all or in some of the answer sets of a program. The eclingo solving process follows a guess and check strategy. It first generates potential truth values for subjective literals and, in a second step, it checks the obtained result with respect to the cautious and brave consequences of the program. This process is implemented using the multi-shot functionalities of clingo. We have also implemented some optimisations, aiming at reducing the search space and, therefore, increasing eclingo's efficiency in some scenarios. Finally, we compare the efficiency of eclingo with two state-of-the-art solvers for epistemic logic programs on a pair of benchmark scenarios and show that eclingo generally outperforms their obtained results. Under consideration for acceptance in TPLP.

artificial intelligence, eclingo, logic & formal reasoning, (16 more...)

arXiv.org Artificial Intelligence

2008.02018

Country:

Europe > Spain (0.04)
Europe > Germany > Brandenburg > Potsdam (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
Europe > Portugal > Lisbon > Lisbon (0.04)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)

Add feedback